High-dimensional data are difficult to interpret intuitively and cannot be processed effectively by traditional machine learning and data mining techniques. To address this, a new nonlinear dimensionality reduction method called Discriminant Diffusion Maps Analysis (DDMA) was proposed. It was implemented by applying a discriminant kernel scheme within the diffusion maps framework. The Gaussian kernel window width was selected from a within-class width and a between-class width according to the sample category labels, which enabled the kernel function to effectively extract data correlation features and accurately describe the structural characteristics of the data space. DDMA was applied to an artificial Swiss-roll test and a penicillin fermentation process, and compared with Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Kernel Principal Component Analysis (KPCA), Laplacian Eigenmaps (LE) and Diffusion Maps (DM). The results show that DDMA represents the high-dimensional data in a low-dimensional space while successfully retaining the original characteristics of the data; moreover, the data structure features in the low-dimensional space generated by DDMA are superior to those generated by the comparison methods. The performance in data dimensionality reduction and feature extraction verifies the effectiveness of the proposed scheme.
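The abstract does not give the exact width-selection rule, so the following is only a minimal sketch of the idea: a diffusion-maps embedding whose Gaussian kernel uses one window width for same-class pairs and another for different-class pairs. The function name and the median-distance heuristic for the two widths are assumptions for illustration, not the authors' method.

```python
import numpy as np

def discriminant_diffusion_maps(X, y, n_components=2, t=1):
    """Sketch of diffusion maps with a label-discriminant Gaussian kernel.

    Assumption: the within-class width is taken from same-class pair
    distances and the between-class width from different-class pair
    distances (median heuristic, chosen here for illustration only).
    """
    n = X.shape[0]
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    same = y[:, None] == y[None, :]
    # discriminant widths: one for within-class pairs, one for between-class
    sigma_w = np.median(d2[same & (d2 > 0)])
    sigma_b = np.median(d2[~same]) if (~same).any() else sigma_w
    sigma = np.where(same, sigma_w, sigma_b)
    K = np.exp(-d2 / sigma)
    # row-normalize the kernel into a Markov transition matrix
    P = K / K.sum(axis=1, keepdims=True)
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # diffusion coordinates: lambda_i^t * psi_i, skipping the trivial
    # first eigenvector (eigenvalue 1, constant direction)
    return vecs[:, 1:n_components + 1] * (vals[1:n_components + 1] ** t)
```

Using a smaller width within a class and a larger one across classes tightens same-class neighborhoods in the resulting diffusion geometry, which is one plausible reading of how the discriminant kernel helps the embedding preserve class structure.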
In this paper, aiming at the a priori selection of the Gaussian kernel parameter (β) in Kernel Principal Component Analysis (KPCA), a kernel parameter discriminant method was proposed for KPCA. It calculated the within-class and between-class kernel window widths for the training samples, and the kernel parameter was then determined by applying the discriminant method to these window widths. The kernel matrix based on the discriminantly selected kernel parameter could accurately describe the structural characteristics of the training space. Finally, Principal Component Analysis (PCA) was applied to decompose the feature space, and the principal components were obtained to realize dimensionality reduction and feature extraction. The discriminant kernel window width method chose a smaller window width in dense regions of a class and a larger one in sparse regions. Simulations on a numerical process and the Tennessee Eastman Process (TEP) using the Discriminated Kernel Principal Component Analysis (Dis-KPCA) method, compared with KPCA and PCA, show that Dis-KPCA is effective for dimensionality reduction of the sample data and separates the three classes of data with 100% accuracy; therefore, the proposed method achieves higher dimension reduction precision.
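The abstract describes deriving β from within-class and between-class window widths and then running standard KPCA, but does not state the combination rule. The sketch below assumes one simple choice (the geometric mean of the two mean squared distances); the function name and that rule are illustrative assumptions, while the kernel centering and eigendecomposition steps are standard KPCA.

```python
import numpy as np

def dis_kpca(X, y, n_components=2):
    """Sketch of KPCA with a discriminant-selected Gaussian parameter.

    Assumption: beta is the geometric mean of the within-class and
    between-class mean squared distances (an illustrative rule, not
    necessarily the one used in the paper).
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    same = y[:, None] == y[None, :]
    np.fill_diagonal(same, False)  # exclude self-pairs
    # within-class and between-class mean squared distances
    w = d2[same].mean()
    b = d2[~same].mean() if (~same).any() else w
    beta = np.sqrt(w * b)  # assumed discriminant combination rule
    K = np.exp(-d2 / beta)
    # double-center the kernel matrix in feature space
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    # eigendecomposition of the symmetric centered kernel
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    # normalize eigenvectors and project onto leading components
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas
```

With β tied to the class geometry of the training set, the kernel matrix reflects both the density within classes and the spread between them, which is the property the abstract credits for the improved separation.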